15 research outputs found

    Improving Grounded Natural Language Understanding through Human-Robot Dialog

    Natural language understanding for robotics can require substantial domain- and platform-specific engineering. For example, for mobile robots to pick-and-place objects in an environment to satisfy human commands, we can specify the language humans use to issue such commands, and connect concept words like "red can" to physical object properties. One way to alleviate this engineering for a new domain is to enable robots in human environments to adapt dynamically, continually learning new language constructions and perceptual concepts. In this work, we present an end-to-end pipeline for translating natural language commands to discrete robot actions, and use clarification dialogs to jointly improve language parsing and concept grounding. We train and evaluate this agent in a virtual setting on Amazon Mechanical Turk, and we transfer the learned agent to a physical robot platform to demonstrate it in the real world.

    Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey

    Reinforcement learning (RL) is a popular paradigm for addressing sequential decision tasks in which the agent has only limited environmental feedback. Despite many advances over the past three decades, learning in many domains still requires a large amount of interaction with the environment, which can be prohibitively expensive in realistic scenarios. To address this problem, transfer learning has been applied to reinforcement learning such that experience gained in one task can be leveraged when starting to learn the next, harder task. More recently, several lines of research have explored how tasks, or data samples themselves, can be sequenced into a curriculum for the purpose of learning a problem that may otherwise be too difficult to learn from scratch. In this article, we present a framework for curriculum learning (CL) in reinforcement learning, and use it to survey and classify existing CL methods in terms of their assumptions, capabilities, and goals. Finally, we use our framework to find open problems and suggest directions for future RL curriculum learning research.

    BWIBots: A platform for bridging the gap between AI and human–robot interaction research

    Recent progress in both AI and robotics has enabled the development of general purpose robot platforms that are capable of executing a wide variety of complex, temporally extended service tasks in open environments. This article introduces a novel, custom-designed multi-robot platform for research on AI, robotics, and especially human–robot interaction for service robots. Called BWIBots, the robots were designed as a part of the Building-Wide Intelligence (BWI) project at the University of Texas at Austin. The article begins with a description of, and justification for, the hardware and software design decisions underlying the BWIBots, with the aim of informing the design of such platforms in the future. It then proceeds to present an overview of various research contributions that have enabled the BWIBots to better (a) execute action sequences to complete user requests, (b) efficiently ask questions to resolve user requests, (c) understand human commands given in natural language, and (d) understand human intention from afar. The article concludes with a look forward towards future research opportunities and applications enabled by the BWIBot platform.

    Contact Force and Scanning Velocity during Active Roughness Perception

    Haptic perception is bidirectionally related to exploratory movements, which means that exploration influences perception, but perception also influences exploration. We can optimize or change exploratory movements according to the perception and/or the task, consciously or unconsciously. This paper presents a psychophysical experiment on active roughness perception to investigate movement changes as the haptic task changes. Exerted normal force and scanning velocity are measured in different perceptual tasks (discrimination or identification) using rough and smooth stimuli. The results show that humans use a greater variation in contact force for the smooth stimuli than for the rough stimuli. Moreover, they use higher scanning velocities and shorter break times between stimuli in the discrimination task than in the identification task. Thus, in roughness perception humans spontaneously use different strategies that seem effective for the perceptual task and the stimuli. A control task, in which the participants just explore the stimuli without any perceptual objective, shows that humans use a smaller contact force and a lower scanning velocity for the rough stimuli than for the smooth stimuli. Possibly, these strategies are related to aversiveness while exploring stimuli.

    Learning inter-task transferability in the absence of target task samples

    In a reinforcement learning setting, the goal of transfer learning is to improve performance on a target task by re-using knowledge from one or more source tasks. A key problem in transfer learning is how to choose appropriate source tasks for a given target task. Current approaches typically require that the agent has some experience in the target domain, or that the target task is specified by a model (e.g., a Markov Decision Process) with known parameters. To address these limitations, this paper proposes a framework for selecting source tasks in the absence of a known model or target task samples. Instead, our approach uses meta-data (e.g., attribute-value pairs) associated with each task to learn the expected benefit of transfer given a source-target task pair. To test the method, we conducted a large-scale experiment in the Ms. Pac-Man domain in which an agent played over 170 million games spanning 192 variations of the task. The agent used vast amounts of experience about transfer learning in the domain to model the benefit (or detriment) of transferring knowledge from one task to another. Subsequently, the agent successfully selected appropriate source tasks for previously unseen target tasks.

    Source Task Creation for Curriculum Learning

    Transfer learning in reinforcement learning has been an active area of research over the past decade. In transfer learning, training on a source task is leveraged to speed up or otherwise improve learning on a target task. This paper presents the more ambitious problem of curriculum learning in reinforcement learning, in which the goal is to design a sequence of source tasks for an agent to train on, such that final performance or learning speed is improved. We take the position that each stage of such a curriculum should be tailored to the current ability of the agent in order to promote learning new behaviors. Thus, as a first step towards creating a curriculum, the trainer must be able to create novel, agent-specific source tasks. We explore how such a space of useful tasks can be created using a parameterized model of the domain and observed trajectories on the target task. We experimentally show that these methods can be used to form components of a curriculum and that such a curriculum can be used successfully for transfer learning in two challenging multiagent reinforcement learning domains.

    Automatic Curriculum Graph Generation for Reinforcement Learning Agents

    In recent years, research has shown that transfer learning methods can be leveraged to construct curricula that sequence a series of simpler tasks such that performance on a final target task is improved. A major limitation of existing approaches is that such curricula are handcrafted by humans who are typically domain experts. To address this limitation, we introduce a method to generate a curriculum based on task descriptors and a novel metric of transfer potential. Our method automatically generates a curriculum as a directed acyclic graph (as opposed to a linear sequence as done in existing work). Experiments in both discrete and continuous domains show that our method produces curricula that improve the agent’s learning performance when compared to the baseline condition of learning on the target task from scratch.